{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "71a329f0",
   "metadata": {},
   "source": [
    "# Deploy FlanT5-XL with dyanmic batching on SageMaker using TensorRT-LLM DLC "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cff05ea4",
   "metadata": {},
   "source": [
    "In this notebook, we deploy the open source Flan-T5-XL model across GPUs on a ml.g5.12xlarge instance for using TensorRT-LLM deep learning container. \n",
    "\n",
    "TensorRT-LLM has two separate software components Python runtime, and C++ runtime for Triton Inference Server. Whereas Python runtime has the components to build and execute the TensorRT runtime engine, but this implementation does not have the support for inflight/continuous batching. \n",
    "\n",
    "More details on [TensorRT-LLM github](https://github.com/NVIDIA/TensorRT-LLM/blob/118b3d7e7bab720d8ea9cd95338da60f7512c93a/docs/source/gpt_runtime.md)\n",
    "\n",
    "So in order to user T5 models in djl-trtllm DLC (as of 0.27.0), you can only use dyanmic batching for now. \n",
    "\n",
    "FlanT5 is available at [Huggingface Model Hub - Model Weights](https://huggingface.co/google/flan-t5-xl)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58b618e7-b049-4d44-8974-8fd1bbeb9a50",
   "metadata": {},
   "source": [
    "## Step 1: Install, import the required libraries; set some variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "67fa3208",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%pip install sagemaker boto3 awscli --upgrade  --quiet\n",
    "%pip install -U sagemaker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "2f217435",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "ec9ac353",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "import boto3\n",
    "import json\n",
    "import jinja2\n",
    "import sagemaker\n",
    "from pathlib import Path\n",
    "from sagemaker import Model, image_uris, serializers, deserializers\n",
    "\n",
    "#role = \"arn:aws:iam::125045733377:role/service-role/AmazonSageMaker-ExecutionRole-20240326T111860\"\n",
    "role = sagemaker.get_execution_role()  # execution role for the endpoint\n",
    "sess = sagemaker.session.Session()  # sagemaker session for interacting with different AWS APIs\n",
    "region = sess._region_name  # region name of the current SageMaker Studio environment\n",
    "account_id = sess.account_id()  # account_id of the current SageMaker Studio environment\n",
    "smr_client = boto3.client(\"sagemaker-runtime\")\n",
    "jinja_env = jinja2.Environment()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81deac79",
   "metadata": {},
   "source": [
    "## Step 2: Start preparing model artifacts\n",
    "In LMI container, we expect some artifacts to help setting up the model\n",
    "- serving.properties (required): Defines the model server settings\n",
    "\n",
    "Below files are optional\n",
    "- model.py (optional): A python file to define the core inference logic\n",
    "- requirements.txt (optional): Any additional pip wheel need to install\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb257a4b",
   "metadata": {},
   "source": [
    "### Low-code no-code experience\n",
    "\n",
    "We have recently introduced low-code, node-code (LCNC) experience to our users. You only need to specify the model_id and we figure out the other configurations for you. \n",
    "\n",
    "If you provide the huggingface model, the we set the default configurations as follows: \n",
    "1. option.dtype = fp32\n",
    "2. option.tensor_parallel_degree = <max-number-of-gpus>\n",
    "3. max_dynamic_batch_size=32 (same as batch_size)\n",
    "4. engine=MPI\n",
    "5. option.entryPoint=djl_python.tensorrt_llm\n",
    "    \n",
    "\n",
    "    \n",
    "    \n",
    "Unfortunately, as of now, this LCNC experience is only available for just in time compilation experience. That is, only when you provide the HuggingFace model and we compile the huggingface model to trt-engine format in runtime for you and then load those TensorRT engines for inference. \n",
    "    \n",
    "If you are providing the pre-comipled model, then you need to specify all those configurations in serving.properties yourself. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4d1b210",
   "metadata": {},
   "source": [
    "#### Just-in-time compilation (HF model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "b011bf5f",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Writing serving.properties\n"
     ]
    }
   ],
   "source": [
    "%%writefile serving.properties\n",
    "#option.model_id = {{s3url}}\n",
    "option.model_id=google/flan-t5-xl\n",
    "#option.entryPoint = {{model_handler_url}}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81b09fd5",
   "metadata": {},
   "source": [
    "#### pre-compiled model (TRT engine format)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd1e4f1d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# %%writefile serving.properties\n",
    "# engine=MPI\n",
    "# option.model_id=google/flan-t5-xl\n",
    "# option.entryPoint=djl_python.tensorrt_llm\n",
    "# option.tensor_parallel_degree=max\n",
    "# option.dtype=fp32\n",
    "# max_dynamic_batch_size=8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ccc82315-1070-4a4f-8d5a-9b9e97e6acc5",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%%sh\n",
    "mkdir flant5_trtllm \n",
    "mv serving.properties flant5_trtllm/\n",
    "rm -f flant5_trtllm.tar.gz\n",
    "tar czvf flant5-trtllm.tar.gz -C flant5_trtllm .\n",
    "rm -rf flant5_trtllm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2e58cf33",
   "metadata": {},
   "source": [
    "## Step 3: To create the end point the steps are:\n",
    "\n",
    "1. Create the Model using the Image container\n",
    "2. Create the endpoint config using the following key parameters\n",
    "\n",
    "    a) Instance Type is ml.g5.12xlarge\n",
    "    \n",
    "3. Create the end point using the endpoint config create\n",
    "\n",
    "#### Create the Model\n",
    "Use the image URI for the DJL container and the s3 location to which the tarball was uploaded.\n",
    "\n",
    "The container downloads the model into the `/tmp` space on the container because SageMaker maps the `/tmp` to the Amazon Elastic Block Store (Amazon EBS) volume that is mounted when we specify the endpoint creation parameter VolumeSizeInGB. It leverages `s5cmd`(https://github.com/peak/s5cmd) which offers a very fast download speed and hence extremely useful when downloading large models.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4d955679",
   "metadata": {},
   "source": [
    "### 3.1 Getting the container image URI\n",
    "\n",
    "[Deep learning containers for large model inference](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#large-model-inference-containers) has more details about each version and framework in the container."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "7a174b36",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# image_uri = image_uris.retrieve(\n",
    "#         framework=\"djl-tensorrtllm\",\n",
    "#         region=sess.boto_session.region_name,\n",
    "#         version=\"0.27.0\"\n",
    "#     )\n",
    "\n",
    "image_uri = \"125045733377.dkr.ecr.us-west-2.amazonaws.com/djl-serving:tensorrt-llm-nightly\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "072c58a5-28bf-4748-abb8-12fe43b4e3d4",
   "metadata": {},
   "source": [
    "### 3.2 Upload artifact on S3 and create SageMaker model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "38b1e5ca",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "S3 Code or Model tar ball uploaded to --- > s3://sagemaker-us-west-2-125045733377/flan-t5-trt-llm//flant5-trtllm.tar.gz\n"
     ]
    }
   ],
   "source": [
    "s3_code_prefix = \"flan-t5-trt-llm/\"\n",
    "bucket = sess.default_bucket()  # bucket to house artifacts\n",
    "code_artifact = sess.upload_data(\"flant5-trtllm.tar.gz\", bucket, s3_code_prefix)\n",
    "print(f\"S3 Code or Model tar ball uploaded to --- > {code_artifact}\")\n",
    "env = {\"HUGGINGFACE_HUB_CACHE\": \"/tmp\", \"TRANSFORMERS_CACHE\": \"/tmp\"}\n",
    "\n",
    "model = Model(image_uri=image_uri, model_data=code_artifact, env=env, role=role)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "004f39f6",
   "metadata": {},
   "source": [
    "### 3.3 Create SageMaker endpoint\n",
    "\n",
    "You need to specify the instance to use and endpoint names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e0e61cd",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "instance_type = \"ml.g5.12xlarge\"\n",
    "endpoint_name = sagemaker.utils.name_from_base(\"djl-trtllm-g5-fp32-flan-t5-xl\")\n",
    "print(endpoint_name)\n",
    "\n",
    "model.deploy(\n",
    "  initial_instance_count=1,\n",
    "  instance_type=instance_type,\n",
    "  endpoint_name=endpoint_name,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "61702132",
   "metadata": {},
   "outputs": [],
   "source": [
    "# our requests and responses will be in json format so we specify the serializer and the deserializer\n",
    "predictor = sagemaker.Predictor(\n",
    "    endpoint_name=endpoint_name,\n",
    "    sagemaker_session=sess,\n",
    "    serializer=serializers.JSONSerializer(),\n",
    "    deserializer=deserializers.JSONDeserializer(),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73c96600-56f1-44b9-af9f-d396cc8f4189",
   "metadata": {},
   "source": [
    "## Step 4: Invoke the endpoint. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "65ce0c86",
   "metadata": {},
   "outputs": [],
   "source": [
    "input_prompt = \"translate English to German: The house is wonderful.\"\n",
    "start = time.time()\n",
    "response = predictor.predict(\n",
    "    {\n",
    "        \"inputs\": input_prompt,\n",
    "        \"parameters\": {\"max_new_tokens\": 1024, \"details\": True},\n",
    "    }\n",
    ")\n",
    "end = time.time()\n",
    "print(f\"Time elapsed {end - start}\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acdcebea",
   "metadata": {},
   "source": [
    "## Clean up the environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "125e9f2f",
   "metadata": {},
   "outputs": [],
   "source": [
    "sess.delete_endpoint(endpoint_name)\n",
    "sess.delete_endpoint_config(endpoint_name)\n",
    "model.delete_model()"
   ]
  }
 ],
 "metadata": {
  "availableInstances": [
   {
    "_defaultOrder": 0,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.t3.medium",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 1,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.t3.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 2,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.t3.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 3,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.t3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 4,
    "_isFastLaunch": true,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 5,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 6,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 7,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 8,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 9,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 10,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 11,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 12,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.m5d.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 13,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.m5d.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 14,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.m5d.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 15,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.m5d.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 16,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.m5d.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 17,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.m5d.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 18,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.m5d.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 19,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.m5d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 20,
    "_isFastLaunch": false,
    "category": "General purpose",
    "gpuNum": 0,
    "hideHardwareSpecs": true,
    "memoryGiB": 0,
    "name": "ml.geospatial.interactive",
    "supportedImageNames": [
     "sagemaker-geospatial-v1-0"
    ],
    "vcpuNum": 0
   },
   {
    "_defaultOrder": 21,
    "_isFastLaunch": true,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 4,
    "name": "ml.c5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 22,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 8,
    "name": "ml.c5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 23,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.c5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 24,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.c5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 25,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 72,
    "name": "ml.c5.9xlarge",
    "vcpuNum": 36
   },
   {
    "_defaultOrder": 26,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 96,
    "name": "ml.c5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 27,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 144,
    "name": "ml.c5.18xlarge",
    "vcpuNum": 72
   },
   {
    "_defaultOrder": 28,
    "_isFastLaunch": false,
    "category": "Compute optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.c5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 29,
    "_isFastLaunch": true,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g4dn.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 30,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g4dn.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 31,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g4dn.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 32,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g4dn.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 33,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g4dn.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 34,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g4dn.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 35,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 61,
    "name": "ml.p3.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 36,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 244,
    "name": "ml.p3.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 37,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 488,
    "name": "ml.p3.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 38,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.p3dn.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 39,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.r5.large",
    "vcpuNum": 2
   },
   {
    "_defaultOrder": 40,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.r5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 41,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.r5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 42,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.r5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 43,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.r5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 44,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.r5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 45,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 512,
    "name": "ml.r5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 46,
    "_isFastLaunch": false,
    "category": "Memory Optimized",
    "gpuNum": 0,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.r5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 47,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 16,
    "name": "ml.g5.xlarge",
    "vcpuNum": 4
   },
   {
    "_defaultOrder": 48,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 32,
    "name": "ml.g5.2xlarge",
    "vcpuNum": 8
   },
   {
    "_defaultOrder": 49,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 64,
    "name": "ml.g5.4xlarge",
    "vcpuNum": 16
   },
   {
    "_defaultOrder": 50,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 128,
    "name": "ml.g5.8xlarge",
    "vcpuNum": 32
   },
   {
    "_defaultOrder": 51,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 1,
    "hideHardwareSpecs": false,
    "memoryGiB": 256,
    "name": "ml.g5.16xlarge",
    "vcpuNum": 64
   },
   {
    "_defaultOrder": 52,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 192,
    "name": "ml.g5.12xlarge",
    "vcpuNum": 48
   },
   {
    "_defaultOrder": 53,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 4,
    "hideHardwareSpecs": false,
    "memoryGiB": 384,
    "name": "ml.g5.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 54,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 768,
    "name": "ml.g5.48xlarge",
    "vcpuNum": 192
   },
   {
    "_defaultOrder": 55,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4d.24xlarge",
    "vcpuNum": 96
   },
   {
    "_defaultOrder": 56,
    "_isFastLaunch": false,
    "category": "Accelerated computing",
    "gpuNum": 8,
    "hideHardwareSpecs": false,
    "memoryGiB": 1152,
    "name": "ml.p4de.24xlarge",
    "vcpuNum": 96
   }
  ],
  "kernelspec": {
   "display_name": "conda_tensorflow2_p310",
   "language": "python",
   "name": "conda_tensorflow2_p310"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
